Introduction: Over the past two decades, our knowledge of the molecular pathogenesis of myeloid neoplasms (MNs), including acute myeloid leukemia (AML) and myelodysplastic syndromes (MDS), has improved dramatically through the identification of major driver alterations using next-generation sequencing. However, in the absence of large-scale whole genome sequencing (WGS) studies with sufficient sequencing depth, analyses have focused mainly on alterations affecting protein-coding sequences, whereas the role of non-coding alterations and structural variations (SVs) in myeloid leukemogenesis has not been fully investigated. The etiology of the underlying mutagenic processes also remains poorly understood. We therefore conducted large-scale deep WGS to address these issues.

Methods: We performed deep WGS with matched normal samples (mean sequencing depth of 127.0 and 30.3 for tumor and normal samples, respectively) for a total of 903 patients with MN (494 AML and 419 MDS), with an additional 263 cases (94 AML and 169 MDS) currently under analysis. Most cases (n=886) were also analyzed using deep targeted capture sequencing of 446 known or putative driver genes. In addition, RNA sequencing was newly performed for nearly half of the cases (n=406). Mutation calling was performed using Mutect2 and GRIDSS2, with in-house modifications to control false-positive calls in the face of frequent contamination of germline samples by tumor cells.

Results: We identified 1,159,638 SNVs and 76,842 sInDels, with a median mutation burden of 0.46 SNVs and 0.030 sInDels per Mb in AML and 0.39 SNVs and 0.022 sInDels per Mb in MDS. Validation by deep targeted capture sequencing estimated a true-positive rate of 95% and a sensitivity of 78.1% for known drivers. In the analysis of mutational signatures, the clock-like signatures SBS1 and SBS5 were predominant. SBS19 and SBS32 were detected in 62% and 14% of AML cases and in 68% and 14% of MDS cases, respectively. As expected, chemotherapy-related signatures (SBS25, SBS31 and SBS99) were significantly enriched in therapy-related MN cases. The SBS18 signature, which is implicated in oxidative stress–induced mutagenesis, was highly enriched in AML cases carrying t(8;21) or inv(16) but was rarely found in MDS cases, highlighting the role of oxidative stress in core binding factor leukemia. Other novel signatures, such as SBS24, SBS39 and SBS40b, were detected in a small number of cases (~1%).
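
The per-Mb burden and the validation figures above reduce to simple ratios, which can be sketched as follows. This is only an illustrative sketch, not the pipeline used in the study: the callable genome size (`CALLABLE_GENOME_MB`), the function names, and the reading of "true-positive rate" as the fraction of WGS calls confirmed by targeted sequencing are all assumptions, since the abstract does not specify them.

```python
from statistics import median

# Assumption: denominator in megabases; the abstract does not state the value used.
CALLABLE_GENOME_MB = 2900.0


def burden_per_mb(variant_count, callable_mb=CALLABLE_GENOME_MB):
    """Mutations per megabase for one sample."""
    return variant_count / callable_mb


def median_burden(counts, callable_mb=CALLABLE_GENOME_MB):
    """Median per-Mb burden across a cohort of per-sample variant counts."""
    return median(burden_per_mb(c, callable_mb) for c in counts)


def validation_metrics(wgs_calls, validated_calls):
    """Compare WGS calls against an orthogonal call set (both sets of variant keys).

    Returns (confirmed_fraction, sensitivity):
      confirmed_fraction = shared calls / all WGS calls
      sensitivity        = shared calls / all orthogonal calls
    """
    shared = len(wgs_calls & validated_calls)
    confirmed_fraction = shared / len(wgs_calls) if wgs_calls else float("nan")
    sensitivity = shared / len(validated_calls) if validated_calls else float("nan")
    return confirmed_fraction, sensitivity
```

For example, `median_burden` takes a list of per-sample SNV counts, and `validation_metrics` takes two sets of hypothetical variant keys such as `("chr4", 105234042, "C", "T")`.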

Our study of a large cohort of MNs also revealed novel candidate driver genes. By evaluating mutation enrichment, we identified a total of 82 candidate driver genes. Of these, 15 had not previously been reported as driver genes in MNs, including 7 listed in the COSMIC Cancer Gene Census as associated with other cancer types. RNA-seq enabled the identification of intronic mutations predicted to cause alternative splicing, both in known driver genes, including TET2 and DNMT3A, and in novel candidates; such mutations were observed in 2.1% of cases. In addition to coding genes, mutations in non-coding RNAs with a potential driver role (e.g. the seed region of mir-142) were detected in 3.3% of cases.

We identified a total of 11,050 SV events, of which 48 disrupted common tumor suppressor genes such as RUNX1, ETV6, CBL, and TP53. Chromothripsis was detected in 5.3% of cases; it was highly enriched in TP53-mutated cases and frequently affected KMT2A, ETS1, ETV6, TP53, EPOR and ERG. A median of 4 chromosomes was affected in a single chromothripsis event.

We detected a total of 340 fusion events in 22.6% of all cases. Of these, 105 involved previously known targets, whereas the remaining 235 were non-recurrent in-frame gene fusions, detected in 11.7% of cases. Although such fusion events were significantly associated with TP53 mutations, 5.6% of TP53-wild-type cases also had in-frame gene fusions, some of which were suggestive of oncogenic potential (e.g. KAT6A::NUTM1). Analysis of expression changes also revealed SVs affecting cis-regulatory elements, including potential enhancer hijacking of a RAS pathway gene, although these were infrequent (1.1%).

Conclusion: Through deep WGS of a large cohort of MN cases, we delineated a comprehensive landscape of driver mutations and detected new mutational signatures and candidate driver mutations, including those affecting non-coding regions. These findings underscore the importance of WGS for a better understanding of the pathogenesis of MNs.
